Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise queries search for a chain of OR strings #3250

Merged
merged 4 commits into from
Mar 5, 2019
Merged

Conversation

ironage
Copy link
Contributor

@ironage ironage commented Mar 2, 2019

This is a performance enhancement motivated by users who are generating queries with many string comparisons on a single column, for example from cocoa's "IN" queries; see https://github.com/realm/engineering/issues/22

The idea is to combine string equality conditions from a single "OR" query node and store them in an unordered_set. With N elements to search, and C conditions, the runtime changes from O(N*C) to O(N). The added benchmark goes from 30 seconds to 2 seconds. This change does not try to optimise indexed columns which should be running O(log(N)*C). The benchmark with indexes turned on runs in 3.5 seconds. Since N is likely the dominant term, using indexes should still be fastest in practice when compared to this optimisation.

First make a sort based on column id. This will allow us to only pass through the
array of conditions once.
Comparing column indecies before making dynamic casts.
Early out if column has search index.
Copy link
Contributor

@finnschiermer finnschiermer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, but see comment.

auto it = m_conditions.begin();
while (it != m_conditions.end()) {
// Only try to optimize on StringNode<Equal> conditions without search index
if (bool(*it) && (first = dynamic_cast<StringNode<Equal>*>(it->get())) && !first->has_search_index()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sligtly confused by this bool(*it) ... if it is necessary here, then how come we don't need it before dereferencing "next" in line 1691 ?

Perhaps some simplification is possible?

Copy link
Contributor Author

@ironage ironage Mar 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I removed the check on bool(*it), it should be covered by the dynamic cast anyway.

@ironage
Copy link
Contributor Author

ironage commented Mar 4, 2019

Here's the actual benchmark results for reference (100000 rows, 1000 query conditions):

Before:

QueryChainedOrStrings (MemOnly, EncryptionOff):   min 353.80ms     max 497.77ms     median 399.02ms     avg 394.69ms     stddev  44.38ms
QueryChainedOrStrings (MemOnly, EncryptionOn):    min 351.54ms     max 456.18ms     median 368.16ms     avg 375.97ms     stddev  32.92ms
QueryChainedOrStrings (Full   , EncryptionOff):   min 350.13ms     max 360.58ms     median 351.87ms     avg 352.84ms     stddev   3.24ms
QueryChainedOrStrings (Full   , EncryptionOn):    min 350.49ms     max 400.45ms     median 352.64ms     avg 357.29ms     stddev  15.26ms

test/benchmark-common-tasks/realm-benchmark-common-tasks  28.96s user 0.22s system 94% cpu 30.743 total

After:

QueryChainedOrStrings (MemOnly, EncryptionOff):     min      5ms (-98.59%)           max   7.27ms (-98.54%)           med   5.77ms (-98.55%)           avg   5.81ms (-98.53%)           stddev   658us (-98.52%)
QueryChainedOrStrings (MemOnly, EncryptionOn):      min   4.92ms (-98.60%)           max   5.77ms (-98.74%)           med   5.19ms (-98.59%)           avg   5.20ms (-98.62%)           stddev   242us (-99.26%)
QueryChainedOrStrings (Full   , EncryptionOff):     min   4.87ms (-98.61%)           max   5.38ms (-98.51%)           med   4.95ms (-98.59%)           avg      5ms (-98.58%)           stddev   134us (-95.88%)
QueryChainedOrStrings (Full   , EncryptionOn):      min   4.93ms (-98.59%)           max   5.43ms (-98.64%)           med   5.21ms (-98.52%)           avg   5.20ms (-98.54%)           stddev   158us (-98.97%)

test/benchmark-common-tasks/realm-benchmark-common-tasks  0.81s user 0.16s system 42% cpu 2.271 total

@tgoyne
Copy link
Member

tgoyne commented Mar 4, 2019

What's the difference like with two conditions? I'd expect this to be slower for a sufficiently small number of conditions, but whether or not that's anything worth caring about depends on how much slower it is and where the break even point is.

@ironage
Copy link
Contributor Author

ironage commented Mar 5, 2019

It's a good point, there is an overhead cost of computing a string hash. The majority of time is spent computing our custom StringData hash (not sure how performant it tries to be). That being the case, I added a simple loop check to iterate through the conditions and check for matches without hashing anything. Based on the following tests, I found that the threshold for choosing this over the hash is around 20 conditions.

(the axis units are: milliseconds vs number of conditions)

  no optimisations unordered_set (hashing) list search
1      
2 0.7 1.8 0.3
3 1 3.5 0.4
4 1.4 4 0.7
5 1.7 4 0.8
10 5.4 4 1.9
20 10.5 4 3.5
30 16 3.6 5
50 25.5 3.2 8.5

screenshot 2019-03-04 at 17 15 18

@jedelbo jedelbo merged commit 993363c into master Mar 5, 2019
@jedelbo jedelbo deleted the js/or-chains branch March 5, 2019 10:12
jedelbo pushed a commit that referenced this pull request Mar 6, 2019
This is a performance enhancement motivated by users who are generating queries with many string
comparisons on a single column, for example from cocoa's "IN" queries.

The idea is to combine string equality conditions from a single "OR" query node and store them in an
unordered_set. With N elements to search, and C conditions, the runtime changes from O(N*C) to
O(N). The added benchmark goes from 30 seconds to 2 seconds. This change does not try to optimise 
indexed columns which should be running O(log(N)*C). The benchmark with indexes turned on runs in 
3.5 seconds. Since N is likely the dominant term, using indexes should still be fastest in practice when
compared to this optimisation.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants